METHOD FOR TRAINING A NEURAL NETWORK, METHOD FOR THE 
CLASSIFICATION OF A SEQUENCE OF INPUT QUANTITIES UPON 
EMPLOYMENT OF A NEURAL NETWORK, NEURAL NETWORK AND 
ARRANGEMENT FOR THE TRAINING OF A NEURAL NETWORK 

The invention is directed to a method for training a neural network, to a 
method for the classification of a sequence of input quantities upon employment of a 
neural network as well as to a neural network and an arrangement for training a neural 
network. 

A neural network comprises neurons that are at least partially connected to one another. Input signals are supplied to the input neurons of the neural network as input quantities. The neural network usually comprises a plurality of layers. A respective neuron generates a signal dependent on the input quantities supplied to it and on an activation function provided for the neuron, said signal being in turn supplied to neurons of a further layer as input quantity according to a prescribable weighting. In an output neuron of an output layer, an output quantity is generated that depends on the quantities supplied to the output neuron by neurons of the preceding layer. There are currently essentially two approaches to the question of the form in which information is stored in a neural network.

A first approach assumes that the information in a neural network is 
encoded in the spectral domain. Given this approach, a chronological sequence of 
input quantities is encoded such that a respective input neuron is provided for each 
time row value of a chronological sequence of the input quantities, the respective time 
row value being applied to this input neuron. 

Given a neural network that is designed according to this approach, a 
hyperbolic tangent (tanh) is usually employed as activation function. 

This first type of neural network is referred to below as a static neural network.

What is particularly disadvantageous about this approach is that a static neural network cannot explicitly take into account, in the internal coding of the sequence of input quantities, the dynamics of a process to which a technical system is subject.

The Time Delay Neural Networks (TDNN) known from [4] attempt to 
counter this disadvantage in that, given a plurality of sequences of input quantities, a 
respective input neuron is provided for each sequence and for each time row value. 
This approach particularly exhibits the disadvantage that the dimension of the input space, represented by the plurality of input neurons, increases exponentially with an increasing plurality of different sequences of input quantities to be taken into consideration.

Moreover, an increasing plurality of neurons in the neural network involves an increased training outlay, since the plurality of training data employed increases with the plurality of neurons. Under these conditions, the training of a static neural network becomes highly calculation-intensive or can practically no longer be implemented.

A gradient-based training method, for example the back-propagation 
method, is usually utilized for training a static neural network. 

[3] also discloses a training method for a static neural network that is 
referred to as the ALOPEX method. In this method, the training of a static neural 
network is viewed as an optimization problem. In this case, the goal of the optimization is the minimization of an error criterion E for a predetermined training data set with training data, taking into consideration the weightings that are present in the static neural network and with which the connections between neurons are weighted.

A training datum is a tuple that [...] input quantities, for example state 
quantities of a technical system or, respectively, boundary conditions that a technical 
system is subject to and that are supplied to a technical system as well as an output 
quantity determined under the boundary conditions and that the technical system 
forms for the input quantities. 

The ALOPEX method shall be explained in greater detail later in 
conjunction with the exemplary embodiment. 


A second approach holds that the information about a system is encoded in the time domain and in the spectral domain. An artificial neural network that does justice to this approach comprises what are referred to as pulsed neurons and is known from [2].

According to [1], a pulsed neuron is modelled such that the behavior of a 
pulsed neuron with respect to an external stimulation, which is referred to below as 
input quantity, is described by a stochastic differential equation of the Ito type 
according to the following rule: 

\[ dV(t) = \left( -\frac{V(t)}{\tau} + \mu \right) dt + \sigma \, dW(t) + w \, dS(t). \tag{1} \]

In the rule (1), dW(t) references a standard Wiener process. A predetermined constant $\tau$ describes the decay of the membrane potential V(t) of the modelled neuron when no input quantity is applied to the neuron. The model simulates the behavior of a biological neuron. For this reason, a pulsed neuron is also referred to as a biologically oriented neuron.

Further, S(t) references a coupling of the neuron with another neuron, i.e. 
the following applies: 

\[ S(t) = \sum_i \delta(t - t_i), \tag{2} \]

whereby $t_i$ references an arrival time at which an external impulse arrives at an input of a neuron. A soma-synaptic intensity is modelled by a synaptic quantity w.

In this model, the pulsed neuron generates a pulse when the membrane potential V(t) reaches a predetermined threshold $\theta$. After the pulse is generated, the membrane potential V(t) of the neuron is reset to a predetermined initialization potential value V(0).
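
The behavior described by rule (1) together with the threshold and reset can be illustrated with a minimal numerical sketch. It uses an Euler-Maruyama discretization, which is a choice of this example and is not named in the text; the function name `simulate_pulsed_neuron` and all parameter values (`dt`, `tau`, `mu`, `sigma`, `w`, `theta`, `v_reset`) are hypothetical.

```python
import math
import random


def simulate_pulsed_neuron(input_times, t_end, dt=1e-4, tau=0.02, mu=0.0,
                           sigma=0.0, w=0.6, theta=1.0, v_reset=0.0, seed=0):
    """Euler-Maruyama integration of rule (1) with threshold and reset.

    All parameter values are illustrative assumptions, not from the text.
    Returns the list of times at which the neuron emitted a pulse.
    """
    rng = random.Random(seed)
    # Map each input pulse to the time step in which it arrives.
    arrivals = {}
    for t_i in input_times:
        step = int(round(t_i / dt))
        arrivals[step] = arrivals.get(step, 0) + 1

    v = v_reset
    pulse_times = []
    for step in range(int(round(t_end / dt))):
        drift = (-v / tau + mu) * dt                          # (-V/tau + mu) dt
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)   # sigma dW(t)
        jump = w * arrivals.get(step, 0)                      # w dS(t)
        v += drift + noise + jump
        if v >= theta:              # threshold reached: emit a pulse
            pulse_times.append(step * dt)
            v = v_reset             # reset to the initialization potential
    return pulse_times
```

With the noise switched off, two input pulses arriving in quick succession drive the potential over the threshold, whereas a single input pulse decays away without producing an output pulse.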

A time sequence of pulses is thus described according to the following rule:

(3)

and satisfies the following rule:

\[ o(t) = \sum_k \delta(t - t_k). \tag{4} \]


It is also known from [1] that, given the assumption of the above- 
described model for a pulsed neuron, a discrimination value I(T) can be formed that 
indicates the dependability with which a sequence of input quantities is correctly 
classified in view of the training data employed for a training of the neural network. 

The discrimination value I(T) is dependent on pulses that are formed by 
the pulsed neurons within a time span [0; T] as well as on a training sequence of input 
quantities that are supplied to the neural network. The discrimination value I(T) 
satisfies the following rule: 


\[ I(T) = I\left( s;\ \left\{ t_1^{(1)}, \ldots, t_{k_1}^{(1)},\ t_1^{(2)}, \ldots, t_{k_2}^{(2)},\ \ldots,\ t_1^{(n)}, \ldots, t_m^{(n)}, \ldots, t_{k_n}^{(n)},\ \ldots,\ t_1^{(N)}, \ldots, t_{k_N}^{(N)} \right\} \right), \tag{5} \]

whereby

• s references the input quantities,

• $t_m^{(n)}$ references a pulse that is generated by a pulsed neuron n at a time m within a time span [0, T],

• $k_n$ (n = 1, ..., N) references a point in time at which the pulsed neuron n has generated the last pulse within the time span [0, T],

• N references a plurality of pulsed neurons contained in the neural network.

A stochastic differential equation of the Ito type derives for a neural network with a plurality of N neurons, described according to the following rule:

\[ dV_i(t) = \left( -\frac{V_i(t)}{\tau} + \mu \right) dt + \sigma \, dW_i(t) + \left[ \sum_{j=1}^{N} w_{ij} \sum_k \delta\left( t - t_k^{(j)} - \Delta_{ij} \right) \right] dt + I_i(t) \, dt, \tag{6} \]

whereby

• $V_i(t)$ references a membrane potential of the i-th neuron (i = 1, ..., N),

• N references a plurality of neurons contained in the neural network,

• $w_{ij}$ respectively references a weighting of a coupling between the i-th and the j-th neuron, illustratively a synaptic intensity between the neurons i and j,

• $\Delta_{ij}$ references a prescribable axonal delay of a signal between the neurons i and j,

• $I_i(t)$ references an external stimulation signal of the neuron i.
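
The coupled rule (6) can be sketched numerically in the same way as the single neuron. The following is a minimal illustration under stated assumptions: axonal delays are given in whole time steps, a delayed pulse from neuron j raises the potential of neuron i by the weighting of their coupling, and the function name `simulate_network` as well as all default parameter values are hypothetical.

```python
import math
import random


def simulate_network(weights, delays, stimulus, n_steps, dt=1e-3, tau=0.02,
                     mu=0.0, sigma=0.0, theta=1.0, v_reset=0.0, seed=0):
    """Euler-Maruyama sketch of rule (6) for N coupled pulsed neurons.

    weights[i][j] : weighting of the coupling from neuron j to neuron i
    delays[i][j]  : axonal delay in whole time steps (assumed >= 1)
    stimulus(i, step) : external stimulation of neuron i at the given step
    Returns one list of pulse step indices per neuron.
    """
    rng = random.Random(seed)
    n = len(weights)
    v = [v_reset] * n
    pulses = [[] for _ in range(n)]
    pending = {}  # pending[step][i]: delayed pulse input arriving at neuron i
    for step in range(n_steps):
        arriving = pending.pop(step, [0.0] * n)
        fired = []
        for i in range(n):
            dv = (-v[i] / tau + mu + stimulus(i, step)) * dt
            dv += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            dv += arriving[i]
            v[i] += dv
            if v[i] >= theta:
                fired.append(i)
                pulses[i].append(step)
                v[i] = v_reset
        # Schedule the pulses just emitted for delayed delivery.
        for j in fired:
            for i in range(n):
                if weights[i][j] != 0.0:
                    slot = pending.setdefault(step + delays[i][j], [0.0] * n)
                    slot[i] += weights[i][j]
    return pulses
```

In a two-neuron configuration where only the first neuron is stimulated and the second receives its pulses with a delay of three steps, the second neuron fires exactly three steps after the first.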

[4] discloses a training method for a neural network. Given this method, 
the neural network is linked such in a control circuit with the model of a technical 
system that the neural network outputs at least one manipulated variable to the model 
as output quantity, and the model generates at least one regulating variable from the 
manipulated quantity supplied by the neural network, said at least one regulating 
variable being supplied to the neural network as input quantity. The regulating 
variable is superimposed with a noise having a known noise distribution before it is 
supplied to the model. As a reaction to the regulating variable modified by the 
impressed noise, the weightings of the neural network are set as follows: A cost 
function evaluates whether the change in weighting at the network has effected an 
improvement of the regulating variable with respect to a rated behavior of the model, 
and such weightings are favored by the cost function. 

The invention is based on the problem of specifying a method as well as 
an arrangement for training a neural network having pulsed neurons. The invention is 
also based on the problem of specifying a method for the classification of a sequence 
of input quantities upon employment of a neural network having pulsed neurons as 
well as specifying a neural network having pulsed neurons. 

The problems are solved by the methods and the arrangement as well as 
by the neural network having the features of the independent patent claims. 

A method for training a neural network that contains pulsed neurons 
comprises the following steps: 


a) the neural network is trained such for a first time span that a discrimination value is maximized, as a result whereof a maximum first discrimination value is formed;

b) the discrimination value is formed dependent on pulses that are formed by the pulsed neurons within the first time span as well as on a training sequence of input quantities that are supplied to the neural network;

c) the following steps are iteratively implemented:

- the first time span is shortened to form a second time span,

- a second discrimination value is formed for the second time span,

- when the second discrimination value is the same as the first discrimination value, then a new iteration ensues with a new second time span that is formed by shortening the second time span of the preceding iteration,

- otherwise, the method is ended and the trained neural network is the neural network of the last iteration wherein the second discrimination value is the same as the first discrimination value.

A method for the classification of a sequence of input quantities upon 
employment of a neural network that contains pulsed neurons and was trained 
according to the following steps comprises the following steps: 
a) the neural network is trained such for a first time span that a discrimination value is maximized, as a result whereof a maximum first discrimination value is formed;

b) the discrimination value is formed dependent on pulses that are formed by the pulsed neurons within the first time span as well as on a training sequence of input quantities that are supplied to the neural network;

c) the following steps are iteratively implemented:

- the first time span is shortened to form a second time span,

- a second discrimination value is formed for the second time span,

- when the second discrimination value is the same as the first discrimination value, then a new iteration ensues with a new second time span that is formed by shortening the second time span of the preceding iteration,

- otherwise, the method is ended and the trained neural network is the neural network of the last iteration wherein the second discrimination value is the same as the first discrimination value,

- the sequence of input quantities is supplied to the neural network;

d) a classification signal is formed that indicates what kind of sequence of input quantities the supplied sequence is.

A neural network that contains pulsed neurons has been trained according 
to the following steps: 

a) the neural network is trained such for a first time span that a discrimination value is maximized, as a result whereof a maximum first discrimination value is formed;

b) the discrimination value is formed dependent on pulses that are formed by the pulsed neurons within the first time span as well as on a training sequence of input quantities that are supplied to the neural network;

c) the following steps are iteratively implemented:

- the first time span is shortened to form a second time span,

- a second discrimination value is formed for the second time span,

- when the second discrimination value is the same as the first discrimination value, then a new iteration ensues with a new second time span that is formed by shortening the second time span of the preceding iteration,

- otherwise, the method is ended and the trained neural network is the neural network of the last iteration wherein the second discrimination value is the same as the first discrimination value.

An arrangement for training a neural network that contains pulsed neurons 
comprises a processor that is configured such that the following steps can be 
implemented: 
a) the neural network is trained such for a first time span that a discrimination value is maximized, as a result whereof a maximum first discrimination value is formed;

b) the discrimination value is formed dependent on pulses that are formed by the pulsed neurons within the first time span as well as on a training sequence of input quantities that are supplied to the neural network;

c) the following steps are iteratively implemented:

- the first time span is shortened to form a second time span,

- a second discrimination value is formed for the second time span,

- when the second discrimination value is the same as the first discrimination value, then a new iteration ensues with a new second time span that is formed by shortening the second time span of the preceding iteration,

- otherwise, the method is ended and the trained neural network is the neural network of the last iteration wherein the second discrimination value is the same as the first discrimination value.

The invention makes it possible to classify a time sequence of input quantities with a neural network that contains pulsed neurons, whereby it is assured that, given optimized classification dependability, a minimized plurality of time values must be supplied to the neural network for classification.

Preferred developments of the invention derive from the dependent claims.

An optimization method that is not gradient based is preferably employed for the maximization of the first discrimination value and/or the second discrimination value, preferably an optimization method based on the ALOPEX method.

The first discrimination value preferably satisfies the following rule:

\[ I(T) = I\left( s;\ \left\{ t_1^{(1)}, \ldots, t_{k_1}^{(1)},\ t_1^{(2)}, \ldots, t_{k_2}^{(2)},\ \ldots,\ t_1^{(n)}, \ldots, t_m^{(n)}, \ldots, t_{k_n}^{(n)},\ \ldots,\ t_1^{(N)}, \ldots, t_{k_N}^{(N)} \right\} \right), \tag{7} \]

whereby

• s references the input quantities,

• $t_m^{(n)}$ references a pulse that is generated by a pulsed neuron n at a time m within a time span [0, T],

• $k_n$ (n = 1, ..., N) references a point in time at which the pulsed neuron n has generated the last pulse within the time span [0, T],

• N references a plurality of pulsed neurons contained in the neural network.

In a further development, the first discrimination value satisfies the following rule:

\[ I(T) = -\int p(\text{out}) \ln\left( p(\text{out}) \right) dt_1^{(1)} \cdots dt_{k_n}^{(n)} \cdots dt_{k_N}^{(N)} + \sum_j p_j \int p\left( \text{out} \mid s^{(j)} \right) \ln\left( p\left( \text{out} \mid s^{(j)} \right) \right) dt_1^{(1)} \cdots dt_{k_n}^{(n)} \cdots dt_{k_N}^{(N)} \tag{8} \]

with

\[ p(\text{out}) = \sum_j p_j \, p\left( \text{out} \mid s^{(j)} \right), \tag{9} \]

whereby

• $s^{(j)}$ references an input quantity that is applied to the neural network at a time j,

• $p_j$ references a probability that the input quantity $s^{(j)}$ is applied to the neural network at a point in time j,

• $p(\text{out} \mid s^{(j)})$ references a conditional probability that a pulse is generated by a pulsed neuron in the neural network under the condition that the input quantity $s^{(j)}$ is applied to the neural network at a point in time j.

The training sequences of input quantities are preferably measured 
physical signals. 

The methods and the arrangements can thus be utilized in the framework of the description of a technical system, particularly for describing or investigating a multi-channel signal that has been registered by an electroencephalograph and describes an electroencephalogram.


The methods and the arrangements can also be utilized for the analysis of multivariate financial data in a financial market for the analysis of economic relationships.

The described method steps can be realized both in software for the processor as well as in hardware, i.e. with a specific circuit.

An exemplary embodiment of the invention is shown in the Figures and is explained in greater detail below.

Shown are:

Figure 1 a flowchart wherein the individual method steps of the exemplary embodiment are presented;

Figure 2 a sketch of an electroencephalograph and a patient for whom an electroencephalogram is produced;

Figure 3 a sketch of a neural network according to the exemplary embodiment;

Figure 4 a sketch on the basis whereof the principle underlying the exemplary embodiment is shown.

Figure 2 shows a patient 200 to whose head 201 sensors 202, 203, 204, 205 and 206 are attached for the registration of brain currents. Electrical signals 207, 208, 209, 210 and 211 picked up by the sensors 202, 203, 204, 205, 206 are supplied to an electroencephalograph 220 via a first input/output interface 221. The electroencephalograph 220 comprises a plurality of input channels. Via the input/output interface 221, which is connected to an analog-to-digital converter 222, the electrical signals are supplied to the electroencephalograph 220 and digitized in the analog-to-digital converter 222, and each registered electrical signal is stored in a memory 223 as a sequence of time row values.

A sequence of time row values is thus characterized by a sampling interval as well as by a time duration, referred to below as time span, during which a respective electrical signal is registered. The memory 223 is connected to the analog-to-digital converter 222 as well as to a processor 224 and a second input/output interface 225 via a bus 226.

A picture screen 228 (via a first cable 227), a keyboard 230 (via a second cable 229) and a computer mouse 232 (via a third cable) are also connected to the second input/output interface 225.

Results of the examination of the patient 200 are shown on the picture screen 228. A user (not shown) can make inputs into the system via the keyboard 230 or the computer mouse 232.

The processor 224 is configured such that the method steps described later 
can be implemented. 

A respective sequence of time row values together with an indication of the class of time row values to which the sequence of time row values is to be allocated forms a training datum.

A plurality of training data form a training data set with which a neural 
network 301 described later is trained. 

Figure 3 shows the neural network 301 with pulsed neurons.

A sequence 306, 307, 308 of time row values is respectively applied to a respective input neuron 302, 303, 304 of an input layer 305. In the framework of the training method, an indication as to whether the sequence 306, 307, 308 of time row values, referred to below as input pattern 306, 307, 308, is an input pattern 306, 307, 308 of a first class or an input pattern 306, 307, 308 of a second class is allocated to each applied sequence 306, 307, 308 of time row values.

A respective input neuron 302, 303, 304 is respectively connected to an intermediate neuron 309, 310, 311 of an intermediate layer 312 via a weighted connection 313, 314, 315.

The intermediate neurons 309, 310, 311 are connected to one another via connections 316, 317, 318, 319, 320, 321 that are likewise weighted.

The intermediate neurons 309, 310, 311 are also connected to further pulsed neurons 322, 323, 324 via weighted connections 325, 326, 327, 328, 329 and 330.

The pulsed neurons respectively comprise the above-described behavior that is presented in [2].

The intermediate neurons 309, 310, 311 are connected to a plurality of intermediate neurons 309, 310, 311; the respective, further pulsed neurons 322, 323, 324 are respectively connected to exactly one intermediate neuron 309, 310, 311. In this way, it is possible to model a far-reaching influencing between neurons of a neural network as well as a local influencing of neurons within the neural network.

An output neuron 331 is connected to the further pulsed neurons 322, 323, 
324 via weighted connections 332, 333 and 334. The output neuron 331 forms an 
output signal 335 that indicates the class to which the input pattern 306, 307, 308 
belongs. 

In the training phase of the neural network 301, the output signal 335 is compared to the class indication allocated to the respective input pattern, and an error signal E is formed that is employed for adapting the weightings of the connections between the neurons present in the neural network 301.

The ALOPEX method, which is not gradient based, is utilized as the training method in the framework of this exemplary embodiment. The goal of the ALOPEX method is the minimization of an error criterion E by adapting the weightings $w_{bc}$ for a training dataset.

The ALOPEX method is explained in greater detail below.

A neuron b is connected to a neuron c via a connection that is weighted with the weighting $w_{bc}$. During an f-th iteration, the weighting $w_{bc}$ is updated according to the following rule:

\[ w_{bc}(f) = w_{bc}(f-1) + \delta_{bc}(f), \tag{10} \]

whereby $\delta_{bc}(f)$ references a small positive or negative, predetermined step width $\delta$ according to the following rule:

\[ \delta_{bc}(f) = \begin{cases} -\delta & \text{with a probability } p_{bc}(f), \\ +\delta & \text{with a probability } 1 - p_{bc}(f). \end{cases} \tag{11} \]

The probability $p_{bc}(f)$ is formed according to the following rule:

\[ p_{bc}(f) = \frac{1}{1 + e^{-C_{bc}(f)/T(f)}}, \tag{12} \]

whereby $C_{bc}(f)$ is formed according to the following rule:

\[ C_{bc}(f) = \Delta w_{bc}(f) \cdot \Delta E(f). \tag{13} \]

T(f) references a prescribable value. $\Delta w_{bc}(f)$ and $\Delta E(f)$ reference the change of the weighting $w_{bc}$ and the change of the error criterion E, respectively, during the preceding two iterations according to the rules:

\[ \Delta w_{bc}(f) = w_{bc}(f-1) - w_{bc}(f-2), \tag{14} \]

\[ \Delta E(f) = E(f-1) - E(f-2). \tag{15} \]

The predetermined value T(f) is updated every F iterations according to the following rule:

\[ T(f) = \frac{1}{F M} \sum_{b,c} \sum_{f'=f-F}^{f-1} \left| C_{bc}(f') \right| \tag{16} \]

when f is a whole multiple of F, and

\[ T(f) = T(f-1) \quad \text{otherwise}, \tag{17} \]

whereby M references a plurality of connections in the neural network 301. Equation (16) can be simplified to form the following rule:

\[ T(f) = \frac{\delta}{F} \sum_{f'=f-F}^{f-1} \left| \Delta E(f') \right|. \tag{18} \]

The neural network 301 is trained according to the above-described 
training method upon employment of the training dataset. 
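
The update rules (10) through (18) can be sketched as follows. This is a minimal illustration and not the embodiment's implementation: the function name `alopex_minimize`, the bootstrap step, the initial value `t_init`, the overflow clamp and the tracking of the best weightings seen are assumptions added for this example, and the sign of the exponent in rule (12) is taken such that the error criterion is minimized.

```python
import math
import random


def alopex_minimize(error_fn, weights, n_iter=2000, delta=0.01,
                    update_every=20, t_init=1.0, seed=0):
    """Minimize error_fn(weights) with ALOPEX-style updates, rules (10)-(18)."""
    rng = random.Random(seed)
    w_prev2 = list(weights)                                  # w(f - 2)
    # Bootstrap (assumption): one random step so that differences exist.
    w_prev1 = [w + rng.choice((-delta, delta)) for w in w_prev2]  # w(f - 1)
    e_prev2, e_prev1 = error_fn(w_prev2), error_fn(w_prev1)
    if e_prev1 < e_prev2:
        best_w, best_e = list(w_prev1), e_prev1
    else:
        best_w, best_e = list(w_prev2), e_prev2
    temperature = t_init
    abs_de_sum = 0.0

    for f in range(1, n_iter + 1):
        if f % update_every == 0:                  # rule (18), else rule (17)
            temperature = max(delta * abs_de_sum / update_every, 1e-12)
            abs_de_sum = 0.0
        d_e = e_prev1 - e_prev2                    # rule (15)
        abs_de_sum += abs(d_e)
        w_new = []
        for w1, w2 in zip(w_prev1, w_prev2):
            c = (w1 - w2) * d_e                    # rules (13) and (14)
            x = min(max(-c / temperature, -700.0), 700.0)  # avoid overflow
            p = 1.0 / (1.0 + math.exp(x))          # rule (12)
            step = -delta if rng.random() < p else delta   # rule (11)
            w_new.append(w1 + step)                # rule (10)
        w_prev2, w_prev1 = w_prev1, w_new
        e_prev2, e_prev1 = e_prev1, error_fn(w_new)
        if e_prev1 < best_e:
            best_w, best_e = list(w_new), e_prev1
    return best_w, best_e
```

Because only evaluations of the error criterion are needed, the same loop could in principle be applied to a network with pulsed neurons, for which gradients are not readily available.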
Further, a first discrimination value I(T) for the neural network 301 is formed according to the following rule:

\[ I(T) = I\left( s;\ \left\{ t_1^{(1)}, \ldots, t_{k_1}^{(1)},\ t_1^{(2)}, \ldots, t_{k_2}^{(2)},\ \ldots,\ t_1^{(n)}, \ldots, t_m^{(n)}, \ldots, t_{k_n}^{(n)},\ \ldots,\ t_1^{(N)}, \ldots, t_{k_N}^{(N)} \right\} \right), \tag{19} \]

whereby

• s references the input quantities,

• $t_m^{(n)}$ references a pulse that is generated by a pulsed neuron n at a time m within a time span [0, T],

• $k_n$ (n = 1, ..., N) references a point in time at which the pulsed neuron n has generated the last pulse within the time span [0, T],

• N references a plurality of pulsed neurons contained in the neural network.
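
The first discrimination value I(T) clearly corresponds to the difference of the following entropies:

\[ I(T) = H(\text{out}) - \left\langle H(\text{out} \mid s) \right\rangle_s, \tag{20} \]

with

\[ H(\text{out}) = -\int p(\text{out}) \ln\left( p(\text{out}) \right) dt_1^{(1)} \cdots dt_{k_n}^{(n)} \cdots dt_{k_N}^{(N)} \tag{21} \]

and

\[ \left\langle H(\text{out} \mid s) \right\rangle_s = -\sum_j p_j \int p\left( \text{out} \mid s^{(j)} \right) \ln\left( p\left( \text{out} \mid s^{(j)} \right) \right) dt_1^{(1)} \cdots dt_{k_n}^{(n)} \cdots dt_{k_N}^{(N)}. \tag{22} \]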

The first discrimination value I(T) thus derives according to the following rule:

\[ I(T) = -\int p(\text{out}) \ln\left( p(\text{out}) \right) dt_1^{(1)} \cdots dt_{k_n}^{(n)} \cdots dt_{k_N}^{(N)} + \sum_j p_j \int p\left( \text{out} \mid s^{(j)} \right) \ln\left( p\left( \text{out} \mid s^{(j)} \right) \right) dt_1^{(1)} \cdots dt_{k_n}^{(n)} \cdots dt_{k_N}^{(N)} \tag{23} \]

with

\[ p(\text{out}) = \sum_j p_j \, p\left( \text{out} \mid s^{(j)} \right), \tag{24} \]

whereby

• $s^{(j)}$ references an input quantity that is applied to the neural network at a time j,

• $p_j$ references a probability that the input quantity $s^{(j)}$ is applied to the neural network at a point in time j,

• $p(\text{out} \mid s^{(j)})$ references a conditional probability that a pulse is generated by a pulsed neuron in the neural network under the condition that the input quantity $s^{(j)}$ is applied to the neural network at a point in time j.

When a maximum first discrimination value I(T) has been determined in 
the framework of training the neural network 301 , then this means that the input 
pattern 306, 307, 308 observed in the first time span contains enough information in 
order to classify the input pattern with adequate dependability. 
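
The entropy difference underlying the discrimination value can be illustrated with a small estimator. The sketch below replaces the integrals over pulse times by sums over discrete pulse patterns and estimates all probabilities by relative frequencies; both simplifications, as well as the function name `discrimination_value` and the sample format, are assumptions of this example.

```python
import math
from collections import Counter, defaultdict


def discrimination_value(samples):
    """Estimate I = H(out) - <H(out|s)>_s from (input_class, pattern) pairs.

    `pattern` must be hashable, e.g. a tuple of 0/1 flags per neuron and
    time sub-span. Probabilities are estimated by relative frequencies.
    """
    total = len(samples)
    by_class = defaultdict(Counter)
    overall = Counter()
    for s, pattern in samples:
        by_class[s][pattern] += 1
        overall[pattern] += 1

    def entropy(counter, n):
        return -sum((c / n) * math.log(c / n) for c in counter.values())

    h_out = entropy(overall, total)                 # H(out)
    h_cond = 0.0                                    # <H(out|s)>_s
    for counter in by_class.values():
        n_s = sum(counter.values())
        h_cond += (n_s / total) * entropy(counter, n_s)
    return h_out - h_cond
```

For two equally probable input classes that always produce distinct pulse patterns, the estimate equals ln 2; if both classes produce the same pattern, it is zero.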

The first discrimination value I(T) is clearly formed (step 101) in the 
framework of the training for a first time span [0; T] (see Figure 1) . 

In a further step (step 102), a second time span is formed by shortening the 
first time span: [0; T'], whereby T' < T applies. 

For the second time span [0; T'], a second discrimination value I(T') is formed in a further step (step 103) in the same way as described above for the first discrimination value I(T).

The first discrimination value I(T) is compared to the second 
discrimination value I(T') (step 104). 
When the second discrimination value I(T') is the same as the first discrimination value I(T), then a new second time span is formed (step 105) by shortening the second time span [0; T'], and the new second time span is considered to be the second time span (step 106). A second discrimination value I(T') is in turn formed (step 103) for the second time span of the new iteration.

Clearly, this iterative method means that the time span wherein pulses 
generated by the pulsed neurons are taken into consideration for forming the output 
signal is shortened until the second discrimination value I(T') is unequal to the first 
discrimination value I(T). 

When the second discrimination value I(T') is smaller than the first 
discrimination value, then the neural network 301 is viewed as being an optimized 
neural network that was trained in the last preceding iteration wherein the second 
discrimination value I(T') was not smaller than the first discrimination value I(T) 
(step 107). 
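
The iteration over steps 101 to 107 can be sketched as a simple loop. The callable `discrimination_for`, which stands in for training the network on a given time span and returning its discrimination value, is hypothetical, as are the shrink factor, the lower bound `t_min` and the tolerance `tol` used to decide whether two values are the same.

```python
def shorten_time_span(discrimination_for, t_first, shrink=0.9, t_min=1e-3,
                      tol=1e-9):
    """Shorten [0, T] while the discrimination value stays at I(T).

    discrimination_for(t) is a hypothetical callable returning I(t) for the
    time span [0, t]; shrink, t_min and tol are assumptions of this sketch.
    Returns the shortest tested span whose value still equals I(T).
    """
    i_first = discrimination_for(t_first)          # step 101
    t_accepted = t_first
    t_second = t_first * shrink                    # step 102
    while t_second >= t_min:
        i_second = discrimination_for(t_second)    # step 103
        if abs(i_second - i_first) > tol:          # step 104: values differ
            break                                  # step 107
        t_accepted = t_second
        t_second *= shrink                         # steps 105 and 106
    return t_accepted
```

A fixed multiplicative shrink factor is only one possible way of shortening the time span; the text does not prescribe how the new second time span is chosen.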

The time span respectively taken into consideration is divided into discrete 
time sub-spans for which the only thing respectively determined is whether a neuron 
generated a pulse during the time sub-span or not. 

In this way, the calculating outlay needed for the training is considerably reduced.
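
The division into discrete time sub-spans can be illustrated as follows. The helper name `binarize_pulses` and the choice of equally wide sub-spans are assumptions of this sketch.

```python
def binarize_pulses(pulse_times, t_span, n_bins):
    """Return one 0/1 flag per time sub-span of [0, t_span].

    A flag is 1 if the neuron generated at least one pulse in that sub-span.
    Pulses outside [0, t_span) are ignored.
    """
    flags = [0] * n_bins
    width = t_span / n_bins
    for t in pulse_times:
        if 0.0 <= t < t_span:
            flags[min(int(t / width), n_bins - 1)] = 1
    return tuple(flags)
```

The resulting tuples are hashable and can therefore serve directly as discrete pulse patterns when pattern frequencies are counted.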

For further illustration, the principle is explained again on the basis of Figure 4.

Figure 4 shows two continuous processes p1 and p2 that are formed by a set of continuous input signals S1 and S2. Two sequences of input quantities, the input patterns, are present after a corresponding, above-described digitization. The input patterns are supplied to the trained neural network 401 in an application phase, and a space-time encoding of the processes p1, p2 is clearly implemented on the basis of the time rows for the trained neural network 401.

On the basis of an output signal 402, the trained neural network 401 indicates the kind of process the input pattern involves. The trained neural network 401 exhibits the property that, first, the dependability of the classification is optimized and, second, a minimum plurality of time row values, i.e. a minimum second time span 403, is required in order to dependably implement the classification.

A few alternatives to the above-described exemplary embodiment are presented below:

The plurality of inputs, of pulsed neurons as well as output signals is 
generally arbitrary. The plurality of different sequences of time row values is also 
arbitrary in the framework of the classification and in the framework of the training. 
An electroencephalogram analysis is thus possible for an arbitrary plurality of 
channels for characterizing tumors. 